Methods for the Classification of Data from Open-Ended Questions in Surveys

Disputation
16 April 2024

Camille Landesvatter

University of Mannheim

Research Questions and Motivation

Which methods can we use to classify data from open-ended survey questions?
Can we leverage these methods to make empirical contributions to substantive research questions?

Motivation:

➡️ The increase in methods to collect natural language (e.g., smartphone surveys and voice technologies) calls for testing and validating automated methods to analyze the resulting data.

➡️ Open-ended survey answers pose a unique challenge for ML applications due to their shortness and lack of context. Effective analysis therefore requires suitable methods, e.g., word embeddings or structural topic models.

Characteristics of Open-Ended Survey Answers

Figure 1: The previous question was: ‘How often can you trust the federal government in Washington to do what is right?’. Your answer was: ‘[Always; Most of the time; About half of the time; Some of the time; Never; Don’t Know]’. In your own words, please explain why you selected this answer.

Structure and Approach of the Dissertation

  1. Introducing readers to the survey methodology of using open-ended questions (OEQs)
    • including historical and modern developments, characteristics and challenges of open-ended questions, types of OEQs (e.g., probing)
  2. Introducing readers to computational methods available for analysis of open-ended answers
    • manual, semi-automated, fully automated
  3. Applying the available methods in three empirical studies

Methods for Analyzing Data from Open-Ended Questions

Table 1. Overview of methods for classifying open-ended survey responses

Studies

Overview

  • Study 1: How valid are trust survey measures? New insights from open-ended probing data and supervised machine learning
  • Study 2: Open-ended survey questions: A comparison of information content in text and audio response formats
  • Study 3: Asking Why: Is there an Affective Component of Political Trust Ratings in Surveys?
  • Data:
    • three self-administered web surveys with open-ended questions, fielded with U.S. non-probability samples
  • Methodology for text classification:
    • supervised ML, unsupervised ML, fine-tuning of pre-trained language model BERT, zero-shot learning

How valid are trust survey measures? New insights from open-ended probing data and supervised machine learning

Landesvatter, C., & Bauer, P. C. (2024). How Valid Are Trust Survey Measures? New Insights From Open-Ended Probing Data and Supervised Machine Learning. Sociological Methods & Research, 0(0). https://doi.org/10.1177/00491241241234871

Study 1: Background

  • Background:
    • ongoing debates about which type of trust researchers are measuring with traditional survey items (i.e., the equivalence debate; cf. Bauer & Freitag 2018)
  • Research Question:
    • How valid are traditional trust survey measures?
  • Experimental Design:
    • block randomized question order where seven closed-ended questions are followed by open-ended follow-up probing questions

Study 1: Methodology

  • Operationalization via two classifications: share of known vs. unknown others in associations (I), sentiment (pos-neu-neg) of associations (II)
  • Supervised classification approach:
      1. manual labeling of randomly sampled documents (n = 1,000/1,500)
      2. fine-tuning the weights of two BERT models (base model, uncased version), using the manually coded data as training data, to classify the remaining documents (n = 6,500/6,000)
    • accuracy: 87% (I) and 95% (II)
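The two-step workflow above — hand-label a small random sample, train on those labels, then classify the remainder — can be sketched with a deliberately simple stand-in. The study fine-tunes BERT; the bag-of-words nearest-centroid classifier and the example answers below are invented purely to illustrate the shape of the pipeline.

```python
from collections import Counter

def tokenize(text):
    return text.lower().split()

def train(labeled):
    """Step 1 stand-in: build one word-frequency profile per manual label."""
    centroids = {}
    for text, label in labeled:
        centroids.setdefault(label, Counter()).update(tokenize(text))
    return centroids

def classify(text, centroids):
    """Step 2 stand-in: assign the label whose profile overlaps most with the text."""
    words = tokenize(text)
    def overlap(label):
        counts = centroids[label]
        return sum(counts[w] for w in words) / sum(counts.values())
    return max(centroids, key=overlap)

# Hypothetical manually coded probing answers (known vs. unknown others)
manual_sample = [
    ("my family and close friends", "known"),
    ("friends i have known for years", "known"),
    ("strangers on the street", "unknown"),
    ("people in general most strangers", "unknown"),
]
centroids = train(manual_sample)

# The trained model then labels the remaining, uncoded answers
print(classify("my friends and family", centroids))        # -> known
print(classify("random strangers in general", centroids))  # -> unknown
```

In the actual study this toy classifier is replaced by a fine-tuned BERT model, but the division of labor — a small manually coded training set, automated classification of the rest, accuracy checked against held-out labels — is the same.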

Study 1: Results

Figure 1: Illustration of exemplary data. Note: n=7,497.
Figure 2: Trust Scores by Associations for the Most People Question.
Note: CIs are 90% and 95%, n=1,499.

Open-ended survey questions: A comparison of information content in text and audio response formats

Landesvatter, C., & Bauer, P. C. (February 2024). Open-ended survey questions: A comparison of information content in text and audio response formats. Working Paper submitted to Public Opinion Quarterly.

Study 2: Background

  • Background:
    • recent increase of voice-based response options in surveys due to mobile devices equipped with voice input technologies, smartphone surveys and speech-to-text technologies
  • Research Question:
    • Are there differences in information content between responses given in voice and text formats?
  • Experimental Design:
    • block randomized question order with open-ended and probing questions
    • random assignment into either the text or voice condition

Study 2: Methodology

  • Operationalization via application of measures from information theory and machine learning to classify open-ended survey answers
    • number of topics, response entropy
    • response length
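The measures listed above can be made concrete with a short sketch. Shannon entropy over an answer's word distribution is one plausible operationalization of response entropy; the study's exact definition may differ, and the example answers are invented.

```python
import math
from collections import Counter

def response_length(answer: str) -> int:
    """Number of word tokens in the answer."""
    return len(answer.split())

def response_entropy(answer: str) -> float:
    """Shannon entropy (in bits) of the answer's word-frequency distribution."""
    words = answer.lower().split()
    counts = Counter(words)
    n = len(words)
    return sum((c / n) * math.log2(n / c) for c in counts.values())

# A one-word answer carries no distributional information: 0 bits
print(response_entropy("yes"))        # -> 0.0
# Two distinct words split the probability mass evenly: 1 bit
print(response_entropy("maybe not"))  # -> 1.0
print(response_length("maybe not"))   # -> 2
```

Under this operationalization, longer and lexically more varied answers score higher on both measures, which is what makes them usable as proxies for information content when comparing text and voice responses.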

Study 2: Results

Figure 3: Information Content Measures across Questions.
Note. CIs are 95%, n_vote-choice: 830 (audio: 225, text: 605), n_future-children: 1,337 (audio: 389, text: 748)

Asking Why: Is there an Affective Component of Political Trust Ratings in Surveys?

Landesvatter, C., & Bauer, P. C. (March 2024). Asking Why: Is there an Affective Component of Political Trust Ratings in Surveys? Working Paper submitted to American Political Science Review.

Study 3: Background

  • Background:
    • the conventional notion stating that trust originates from informed, rational, and consequential judgments is challenged by the idea of an “affective-based” form of (political) trust
  • Research Question:
    • Are individual trust judgments in surveys driven by affective rationales?
  • Questionnaire Design:
    • closed-ended political trust question followed by open-ended probing question

Study 3: Methodology

  • Operationalization via sentiment and emotion analysis

  • Transcript-based

    • pysentimiento for sentiment recognition (Pérez et al. 2023)
    • zero-shot prompting with GPT-3.5
  • Speech-based

    • SpeechBrain for Speech Emotion Recognition (Ravanelli et al. 2021)
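For the transcript-based step, the pos-neu-neg output format can be illustrated with a minimal rule-based scorer. The study uses pysentimiento, a fine-tuned transformer; the lexicon-based stand-in and both word lists below are invented for the example and only mimic the input/output shape.

```python
# Toy word lists standing in for a learned sentiment model (invented)
POSITIVE = {"good", "great", "honest", "trust", "fair", "competent"}
NEGATIVE = {"bad", "corrupt", "liars", "angry", "dishonest", "broken"}

def sentiment(answer: str) -> str:
    """Return a pos-neu-neg label based on lexicon overlap."""
    words = set(answer.lower().replace(",", " ").split())
    score = len(words & POSITIVE) - len(words & NEGATIVE)
    if score > 0:
        return "POS"
    if score < 0:
        return "NEG"
    return "NEU"

print(sentiment("they are corrupt liars"))       # -> NEG
print(sentiment("mostly honest and competent"))  # -> POS
print(sentiment("no particular reason"))         # -> NEU
```

pysentimiento itself exposes a comparable interface (an analyzer built via `create_analyzer(task="sentiment", ...)` that predicts POS/NEU/NEG labels), so swapping the toy scorer for the real model leaves the downstream analysis unchanged.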

Study 3: Results

Figure 4: Emotion Recognition for Speech Data with SpeechBrain. Note. CIs are 95%, n_neutral=408, n_anger=44, n_sadness=18, n_happiness=21.

Summary

  • Web surveys make it possible to collect narrative answers that provide valuable insights into survey responses
  • Various modern developments (smartphone surveys, speech-to-text algorithms) can be leveraged to collect such data in innovative ways (e.g., spoken answers); whether this is advisable depends on the survey's goals (e.g., the target population)
  • Natural language can be analyzed with computational measures that can inform debates in different fields, e.g.:
    • Study 1: the equivalence debate in trust research
    • Study 2: survey questionnaire design and the choice of response format
    • Study 3: the cognitive-versus-affective debate in political trust research
    • Studies 1-3: item and data quality in general (e.g., associations, sentiment, emotions)

Conclusion: Machine Learning and Open-ended Answers

Facilitated accessibility and implementation of semi-automated methods.
  • large, general-purpose pre-trained models (e.g., BERT, GPT) allow less resource-intensive fine-tuning compared to training traditional supervised models from scratch

    • Study 1: manually labeling ~13% (n=1,000) of documents for fine-tuning yielded sufficient accuracy (87%)
    • increasing the number of manually labeled documents can further improve accuracy (here: 92%) and transparency
  • But: these models come with a lack of transparency

    • always start with simple methods and evaluate
      • Study 1: Random Forest -> BERT
    • accuracy vs. transparency trade-off

Conclusion: Machine Learning and Open-Ended Answers

Increase in possibilities of fully automated methods (e.g., prompt engineering).
  • fully automated methods such as zero-shot prompting can keep up with fine-tuned versions of pre-trained models (e.g., pysentimiento, Study 3)
  • the final decision, on the number of manual examples and on a method in general (fully automated/unsupervised versus semi-automated/supervised with fine-tuning), depends on the difficulty of the given task, the desired accuracy, and the available time and budget

Thank you for your Attention!

References